BioLizardStyleR

Introduction

In this guide, we’ll walk you through the installation of the BioLizardStyleR package and provide a brief overview of its functions using a practical example. The BioLizardStyleR package helps you make ggplot2 plots in the BioLizard style. It includes an integrated font installer, a dedicated ggplot2 theme, custom color functions, and a tool to add a BioLizard footer and export the plot.

1. Installation

To install the BioLizardStyleR package directly from GitHub, you will need to utilize the devtools package.

if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

devtools::install_github("lizard-bio/nature-grade-visualization-playground", subdir="BioLizardStyleR", build_vignettes = TRUE)
library(BioLizardStyleR)

2. Font Installation

The BioLizardStyleR package makes use of the signature BioLizard font ‘Lato’. This font can be found on Google Fonts under an open font license.

If not yet installed, Lato is automatically installed when calling ‘lizard_style()’. In case this automatic installation fails on your system, you can install it manually using the following steps:

  1. Download Fonts: Download the two font files from the GitHub repository Nature Grade Visualization Playground.
  2. Install Fonts: Install the downloaded fonts on your system.
  3. Complete: Once the installation is done, you’re all set to use the font.

This installation is a one-time process and normally does not need to be repeated in subsequent R sessions. Running the following function without any arguments should install the font. If you run into issues caused by a pre-existing installation of the extrafont package and need to clear its database, use the argument clearDatabase = TRUE. This reinstalls the extrafontdb package, a companion to extrafont, to reset the database.

install_biolizard_fonts()
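To check whether Lato is visible on your machine, a quick lookup in the system font list can help. The sketch below assumes the optional systemfonts package, which is not part of BioLizardStyleR, and uses a hypothetical helper name, has_font(). It inspects fonts installed on the operating system, which is not necessarily the same as the extrafont database mentioned above.

```r
# Hypothetical helper (not part of BioLizardStyleR):
# is a font family installed on this system?
# Returns NA if the optional 'systemfonts' package is unavailable.
has_font <- function(family) {
  if (!requireNamespace("systemfonts", quietly = TRUE)) {
    return(NA)
  }
  installed <- unique(systemfonts::system_fonts()$family)
  tolower(family) %in% tolower(installed)
}

has_font("Lato")
```

A result of FALSE suggests that the manual installation steps are worth trying.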

3. Crafting a BioLizard-Styled ggplot

Transforming your ggplot to match the BioLizard style involves a three-step process. Let’s illustrate this on a basic ggplot.

library(ggplot2)
data("mtcars")
mtcars$gear <- factor(mtcars$gear, levels = c(3, 4, 5), ordered = TRUE)

testplot <- ggplot(data = mtcars, aes(x = hp, y = mpg)) + 
              geom_point(aes(color = gear),size=3) +  
              labs(title = "Miles per Gallon vs. Horsepower",  
                   x = "Horsepower",  
                   y = "Miles per Gallon",  
                   color = "Gears")
testplot

3.1 Applying the BioLizard theme

In the first step, we apply the BioLizard theme through the lizard_style() function. This not only modifies various theme settings but also changes the font. Note that the lizard theme is only a starting template: depending on your specific plot, title length, variables in use, and other factors, you might need to make further theme adjustments.

testplot <- testplot + lizard_style()
testplot

Tip: you can use theme_set(lizard_style()) at the beginning of your script or report to set the theme to lizard_style() in all plots, without having to add + lizard_style() to each plot individually. To reset the theme to the ggplot2 default, you can call reset_theme_settings().
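In plain ggplot2, theme_set() returns the theme that was active before the call, which allows restoring it later. A minimal sketch of that pattern, with theme_minimal() used only as a stand-in for lizard_style():

```r
library(ggplot2)

# theme_set() invisibly returns the previously active theme
old_theme <- theme_set(theme_minimal())

p_demo <- ggplot(mtcars, aes(x = hp, y = mpg)) + geom_point()
print(p_demo)  # rendered with the currently active default theme

# Restore whatever theme was active before
theme_set(old_theme)
```

This restores the previous theme rather than the ggplot2 default, which can be useful when a script should leave the session as it found it.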

3.2 Adding a BioLizard color palette

Once you’ve styled your plot, you can enhance it with a ‘BioLizard’ color palette. You have the option to choose from qualitative, sequential, or divergent palettes tailored for either discrete or continuous variables. These palettes incorporate the distinct hues of the BioLizard house colors, while ensuring they are color-blind friendly and perceptually uniform. For a deeper understanding of each palette, it’s recommended to consult the documentation for the respective function.

The coloring functions within the package adhere to a flexible structure: scale_color/fill_biolizard. The type (discrete/continuous) and the scheme (qualitative/sequential/divergent) must be determined as function parameters.
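As a rough rule of thumb, unordered categories suit a qualitative scheme, ordered categories or one-directional numeric data suit a sequential scheme, and numeric data centered on a meaningful midpoint suit a divergent scheme. The sketch below encodes that rule in a hypothetical helper, suggest_scale(), which is not part of BioLizardStyleR; treating zero as the midpoint is an assumption made for illustration.

```r
# Hypothetical helper (not part of BioLizardStyleR): rule-of-thumb choice of
# 'type' and 'scheme' for a variable mapped to colour or fill.
suggest_scale <- function(x) {
  if (is.numeric(x)) {
    # Values on both sides of zero usually read best on a divergent scale
    spans_zero <- any(x < 0, na.rm = TRUE) && any(x > 0, na.rm = TRUE)
    list(type = "continuous",
         scheme = if (spans_zero) "divergent" else "sequential")
  } else if (is.ordered(x)) {
    list(type = "discrete", scheme = "sequential")
  } else {
    list(type = "discrete", scheme = "qualitative")
  }
}

suggest_scale(c(-1.2, 0.4, 2.0))             # continuous, divergent
suggest_scale(factor(c("low", "high"),
                     levels = c("low", "high"),
                     ordered = TRUE))        # discrete, sequential
suggest_scale(c("treated", "untreated"))     # discrete, qualitative
```

The returned values match the type and scheme parameter names described above, so they can be passed on by hand once you have checked that the suggestion fits your data.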

data("mtcars")
mtcars$gear <- as.factor(mtcars$gear)

# Base plot
testplot <- ggplot(data = mtcars, aes(x = hp, y = mpg)) +
  geom_point(aes(color = gear), size=3) +
  labs(
    title = "Miles per Gallon vs. Horsepower",
    x = "Horsepower",
    y = "Miles per Gallon",
    color = "Gears"
  )

# Applying the qualitative color scale
testplot_qualitative <- testplot + scale_color_biolizard(type = "discrete", scheme = "qualitative") + lizard_style()
testplot_qualitative


# Applying the sequential color scale (assuming you have a different dataset or aesthetic)
testplot_sequential <- testplot + scale_color_biolizard(type = "discrete", scheme = "sequential") + lizard_style()
testplot_sequential


# Applying the divergent color scale (assuming you have a different dataset or aesthetic)
testplot_divergent <- testplot + scale_color_biolizard(type = "discrete", scheme = "divergent") + lizard_style()
testplot_divergent

It is possible to reverse the color scales by setting reverse = TRUE:

testplot_divergent <- testplot + scale_color_biolizard(type = "discrete", scheme = "divergent", reverse = TRUE) + lizard_style()
testplot_divergent

The discrete color palettes can also be called directly, in case you would like to use them outside ggplot. The three base colors can also be used directly as blz_green, blz_blue and blz_yellow.

#use color palette
mycolors <- biolizard_pal_sequential(3, reverse = TRUE)
gearcolors <- mycolors[as.factor(mtcars$gear)]

plot(x = mtcars$hp, y = mtcars$mpg, pch = 19, col = gearcolors, family = "Lato")
legend("topright", legend = paste("Gears", 3:5), col = mycolors, pch = 19, bty = "n")

#use BLZ base colors
mycolors <- c(blz_green, blz_blue, blz_yellow)
gearcolors <- mycolors[as.factor(mtcars$gear)]

plot(x = mtcars$hp, y = mtcars$mpg, pch = 19, col = gearcolors, family = "Lato")
legend("topright", legend = paste("Gears", 3:5), col = mycolors, pch = 19, bty = "n")

3.4 Reset ggplot2 defaults

Note that calling lizard_style() will overwrite the default colors for points, lines, boxplots, areas, etc. To reset them to the ggplot2 defaults, you can either restart the R session or use reset_colors().

In the plot below, the points are shown in BioLizard green. Using reset_colors() will revert these changes:

p <- ggplot(mtcars, aes(mpg, disp)) + geom_point()
p

reset_colors()
p
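For context, plain ggplot2 exposes per-geom default colors through update_geom_defaults(). Whether BioLizardStyleR relies on this mechanism internally is an assumption; the sketch only illustrates how a default can be changed and restored in ggplot2 itself. The color "darkgreen" is a placeholder, and resetting with NULL needs ggplot2 3.5.0 or later.

```r
library(ggplot2)

# Change the default colour used by geom_point() when no colour is given
update_geom_defaults("point", list(colour = "darkgreen"))

p_demo <- ggplot(mtcars, aes(mpg, disp)) + geom_point()
print(p_demo)  # defaults are resolved at draw time: points use the new default

# Restore the original ggplot2 default (requires ggplot2 >= 3.5.0)
update_geom_defaults("point", NULL)
print(p_demo)  # same plot object, drawn with the restored default
```

Because defaults are looked up when a plot is drawn, the same plot object changes appearance after a reset, which matches the behavior of p in the example above.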

4. Examples

4.1 Simple bar chart

x = 1:8
y = 1:8

df <- data.frame(x=x, y=y)
ggplot(df, aes(x=x, y=y, fill = factor(x))) +
  geom_col() +
  scale_fill_biolizard() +
  lizard_style()

4.2 Simple polygon plot


ids <- factor(c("1.1", "2.1", "1.2", "2.2", "1.3", "2.3"))
values <- data.frame(
  id = ids,
  value = c(3, 3.1, 3.1, 3.2, 3.15, 3.5)
)
positions <- data.frame(
  id = rep(ids, each = 4),
  x = c(2, 1, 1.1, 2.2, 1, 0, 0.3, 1.1, 2.2, 1.1, 1.2, 2.5, 1.1, 0.3,
        0.5, 1.2, 2.5, 1.2, 1.3, 2.7, 1.2, 0.5, 0.6, 1.3),
  y = c(-0.5, 0, 1, 0.5, 0, 0.5, 1.5, 1, 0.5, 1, 2.1, 1.7, 1, 1.5,
        2.2, 2.1, 1.7, 2.1, 3.2, 2.8, 2.1, 2.2, 3.3, 3.2)
)

datapoly <- merge(values, positions, by = c("id"))
ggplot(datapoly, aes(x = x, y = y)) +
  geom_polygon(aes(group = id, fill = id)) +
  scale_fill_biolizard(type = "discrete", scheme = "sequential") +
  lizard_style()

4.3 Simple box plot

ggplot(data = mtcars, aes(x = as.factor(gear), y = mpg)) +
  geom_boxplot() +
  labs(
    title = "Miles per Gallon vs. Gears",
    x = "Gears",
    y = "Miles per Gallon"
  ) +
  lizard_style()

A violin plot is similar to a boxplot, but captures more information on the data distribution:

ggplot(data = mtcars, aes(x = as.factor(gear), y = mpg)) +
  geom_violin() +
  labs(
    title = "Miles per Gallon vs. Gears",
    x = "Gears",
    y = "Miles per Gallon"
  ) +
  lizard_style()

4.5 Simple density plot

ggplot(data = mtcars, aes(x = mpg)) +
  geom_density() +
  lizard_style()

4.6 Simple scatterplot


ggplot(data = mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  geom_smooth() +
  labs(
    title = "Miles per Gallon vs. Horsepower",
    x = "Horsepower",
    y = "Miles per Gallon",
    color = "Miles per Gallon",
    fill = "Miles per Gallon"
  )
#> `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

4.7 Facets

Faceting is a great way to lay out related plots side-by-side, with the axes either on the same scale or on different scales.

# use facet_wrap to stratify plots by one variable
ggplot(mtcars, aes(x=mpg, y=hp)) +
  geom_point() +
  facet_wrap(vars(gear)) +
  lizard_style() +
  theme(axis.text.x = element_text(size = 8, angle = 90, vjust = 0.5))


# use facet_grid for two variables
ggplot(mtcars, aes(x=mpg, y=hp)) +
  geom_point() +
  facet_grid(rows = vars(gear), cols = vars(carb)) +
  lizard_style() +
  theme(axis.text.x = element_text(size = 8, angle = 90, vjust = 0.5))

Note that the minimal style makes it difficult to distinguish the panels from each other. Drawing the axes on every panel improves readability, but facet_grid and facet_wrap remove them when the panels share a scale. The similar functions facet_rep_grid and facet_rep_wrap from the lemon package are better suited to the lizard style, since they show all axes.

# lemon is a CRAN package, so install.packages() is sufficient
if (!requireNamespace("lemon", quietly = TRUE)) {
  install.packages("lemon")
}

library(lemon)

# use facet_rep_wrap to stratify plots by one variable
ggplot(mtcars, aes(x=mpg, y=hp)) +
  geom_point() +
  facet_rep_wrap(vars(gear)) +
  lizard_style() +
  theme(axis.text.x = element_text(size = 8, angle = 90, vjust = 0.5))


# use facet_rep_grid for two variables
ggplot(mtcars, aes(x=mpg, y=hp)) +
  geom_point() +
  facet_rep_grid(rows = vars(gear), cols = vars(carb)) +
  lizard_style() +
  theme(axis.text.x = element_text(size = 8, angle = 90, vjust = 0.5))
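As an alternative to lemon, ggplot2 version 3.5.0 and later can draw axes on every panel through the axes argument of facet_wrap() and facet_grid(). A minimal sketch, with theme_classic() used only as a stand-in for lizard_style():

```r
library(ggplot2)

# axes = "all" draws axes on every panel (ggplot2 >= 3.5.0),
# similar in spirit to lemon::facet_rep_wrap()
p_axes <- ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point() +
  facet_wrap(vars(gear), axes = "all") +
  theme_classic()
p_axes
```

This avoids an extra dependency, at the cost of requiring a recent ggplot2 version.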

4.8 Treemap plots

Treemap plots are a powerful way of visualizing hierarchical data such as GO terms. The size of each rectangle corresponds to the value you want to show (e.g. a p-value), and related rectangles are grouped together. This example makes use of the Pokémon dataset from the highcharter package and is based on https://yjunechoe.github.io/posts/2020-06-30-treemap-with-ggplot/.

The example below also illustrates how the colorRampPalette function can be used to expand a discrete color scale if the number of colors is not sufficient. Note that a discrete, qualitative color scale may not be very useful for a large number of colors, as it will be hard to distinguish the individual colors from one another.
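To see what colorRampPalette() does in isolation, here is a minimal base-R sketch. The three hex colors are placeholders, not BioLizard colors.

```r
# Three placeholder colours standing in for a small discrete palette
base_cols <- c("#1B9E77", "#D95F02", "#7570B3")

# colorRampPalette() returns a function that interpolates n colours
expand_palette <- colorRampPalette(base_cols)
expanded <- expand_palette(10)

length(expanded)    # 10
expanded[c(1, 10)]  # the first and last base colours are preserved
```

The interpolated colors lie between the base colors, which is why neighboring entries of a heavily expanded qualitative palette become hard to tell apart.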

The treemapify package allows plotting treemaps with ggplot2.

# treemapify is a CRAN package, so install.packages() is sufficient
if (!requireNamespace("treemapify", quietly = TRUE)) {
  install.packages("treemapify")
}

if(!require("highcharter", quietly = TRUE)){
  install.packages("highcharter")
}
#> Registered S3 method overwritten by 'quantmod':
#>   method            from
#>   as.zoo.data.frame zoo
#> Highcharts (www.highcharts.com) is a Highsoft software product which is
#> not free for commercial and Governmental use

library(treemapify)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

data("pokemon", package = "highcharter")
# Cleaning up data for a treemap
data <- pokemon %>% 
  select(pokemon, type_1, type_2) %>%
  mutate(type_2 = ifelse(is.na(type_2), paste("only", type_1), type_2)) %>% 
  group_by(type_1, type_2) %>%
  summarise(n = length(pokemon)) %>% 
  ungroup()
#> `summarise()` has grouped output by 'type_1'. You can override using the
#> `.groups` argument.

head(data)
#> # A tibble: 6 × 3
#>   type_1 type_2       n
#>   <chr>  <chr>    <int>
#> 1 bug    electric     4
#> 2 bug    fairy        2
#> 3 bug    fighting     3
#> 4 bug    fire         2
#> 5 bug    flying      13
#> 6 bug    ghost        1

# expand the biolizard qualitative palette
ncols <- length(unique(data$type_1))
BLZ_qual <- colorRampPalette(biolizard_pal_qualitative(8))(ncols)

ggplot(data, aes(area = n, label = type_2, fill = type_1,
                subgroup = type_1)) +
  ggtitle("Number of pokémon stratified by type") +
  # 1. Draw type_2 borders and fill colors
  geom_treemap() +
  # 2. Draw type_1 borders
  geom_treemap_subgroup_border() +
  # 3. Print type_1 text
  geom_treemap_subgroup_text(place = "centre", grow = TRUE, alpha = 0.5, colour = "black",
                             fontface = "italic", min.size = 0,
                             family = "Lato") +   # "Lato" must be set manually, as this text is not controlled by theme()
  # 4. Print type_2 text
  geom_treemap_text(colour = "white", place = "topleft", reflow = TRUE,
                    family = "Lato") +   # "Lato" must be set manually, as this text is not controlled by theme()
  
  # 5. BLZ style
  scale_fill_manual(values = BLZ_qual) +
  lizard_style() +
  theme(legend.position = "none")

However, this is not ideal: the font of the text labels still has to be set manually, so lizard_style() does not work very well for these treemaps, even though they are ggplot2-based. Another option is to move away from ggplot2 altogether and use the ‘treemap’ library, which generally leads to more aesthetically pleasing plots:

# treemap is a CRAN package, so install.packages() is sufficient
if (!requireNamespace("treemap", quietly = TRUE)) {
  install.packages("treemap")
}

library(treemap)

treemap(data, index=c("type_1", "type_2"), vSize="n", type="index",
        title="Number of pokémon stratified by type", 
        palette=BLZ_qual,  #set the BLZ color palette
        fontcolor.labels=c("#FFFFFFDD", "#00000080"), bg.labels=0, border.col="#00000080",
        fontfamily.title = "Lato", fontfamily.labels = "Lato", fontfamily.legend = "Lato")  #set the lato font

4.9 Chicago plots

Idea from “A ggplot2 Tutorial for Beautiful Plotting in R”.

if (!requireNamespace("readr", quietly = TRUE)) {
  # Install the package if it's not installed
  install.packages("readr")
}
library(ggplot2)
chic <- readr::read_csv("https://figshare.com/ndownloader/files/42307179") #An example dataset
#> Rows: 1461 Columns: 11
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr  (3): city, season, month
#> dbl  (7): temp, o3, dewpoint, pm10, yday, month_numeric, year
#> date (1): date
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

g <- ggplot(chic, aes(x = season, y = o3,
                   color = season)) +
    labs(x = "Season", y = "Ozone") + 
    geom_violin(fill = "gray80", linewidth = 1, alpha = .5) +
    geom_jitter(alpha = .25, width = .3) +
    coord_flip()

g + lizard_style() +
  scale_color_biolizard()

# requireNamespace() checks one package at a time, so loop over the names
for (pkg in c("corrr", "forcats")) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
}
library(corrr)
library(forcats)

corm <-
  chic |>
  dplyr::select(temp, dewpoint, pm10, o3) |>
  corrr::correlate(diagonal = 1) |>
  corrr::shave(upper = FALSE)
#> Correlation computed with
#> • Method: 'pearson'
#> • Missing treated using: 'pairwise.complete.obs'

corm <- corm |>
  tidyr::pivot_longer(
    cols = -term,
    names_to = "colname",
    values_to = "corr"
  ) |>
  dplyr::mutate(
    rowname = forcats::fct_inorder(term),
    colname = forcats::fct_inorder(colname),
    label = dplyr::if_else(is.na(corr), "", sprintf("%1.2f", corr))
  )

g <- ggplot(corm, aes(rowname, fct_rev(colname),
                     fill = corr)) +
      geom_tile() +
      geom_text(aes(label = label)) +
      coord_fixed(expand = FALSE) +
      labs(x = NULL, y = NULL) 

g + scale_fill_biolizard(type = "continuous", scheme = "divergent", limits = c(-1, 1),
                         na.value = "white", name = "Pearson\nCorrelation") +
  lizard_style() +
  # ggplot2 >= 3.5.0: a numeric legend.position is deprecated,
  # use legend.position.inside to place the legend inside the panel
  theme(legend.position = "inside", legend.position.inside = c(.95, .8))

g <- ggplot(chic, aes(temp, o3)) +
        geom_hex(color = "grey") +
        labs(x = "Temperature (°F)", y = "Ozone Level", title= 'How Temperature Affects Ozone Levels',
             subtitle='Hexagonal Binning of Ozone Levels vs. Temperature in °F')

g <- g + lizard_style() +
      scale_fill_biolizard(type='continuous',scheme='sequential')
g


finalise_lizardplot(g,source="Source: The Chicago dataset (https://www.cedricscherer.com/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/)")
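If only a plain export without the BioLizard footer is needed, ggplot2's own ggsave() is an option. The sketch below is an alternative to, not a description of, finalise_lizardplot(); the file name, dimensions, and caption text are placeholders.

```r
library(ggplot2)

p_export <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  labs(caption = "Source: mtcars (placeholder caption)")

# Write a PDF to a temporary file; width and height are in inches by default
out_file <- file.path(tempdir(), "example_plot.pdf")
ggsave(out_file, plot = p_export, width = 6, height = 4)
file.exists(out_file)
```

The output format is chosen from the file extension, so changing it to ".png" or ".svg" switches the device, provided the required graphics support is available.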

4.10 Interactive plots (DE analysis)

ggplotly can be used to make a ggplot interactive, for example for use in an HTML report. See an example of an interactive PCA plot and volcano plot below. To generate these plots, we used the per-gene read counts of the pasilla dataset.

Note: to successfully install pasilla, you might have to manually install libbz2 (libbz2-dev (deb), libbz2-devel (rpm), bzip2 (brew)).


# plotly is a CRAN package, so install.packages() is sufficient
if (!require("plotly", quietly = TRUE)) {
  install.packages("plotly")
}
#> 
#> Attaching package: 'plotly'
#> The following object is masked from 'package:ggplot2':
#> 
#>     last_plot
#> The following object is masked from 'package:stats':
#> 
#>     filter
#> The following object is masked from 'package:graphics':
#> 
#>     layout

if(!require("pasilla", quietly = TRUE)){
  BiocManager::install("pasilla")
}

library(pasilla)
library(plotly)

# get gene counts
datafiles <-  system.file("extdata", package="pasilla", mustWork=TRUE)
count.table <-  read.table(file.path(datafiles, "pasilla_gene_counts.tsv"), header=TRUE, row.names=1, quote="", comment.char="" )
head(count.table)
#>             untreated1 untreated2 untreated3 untreated4 treated1 treated2
#> FBgn0000003          0          0          0          0        0        0
#> FBgn0000008         92        161         76         70      140       88
#> FBgn0000014          5          1          0          0        4        0
#> FBgn0000015          0          2          1          2        1        0
#> FBgn0000017       4664       8714       3564       3150     6205     3072
#> FBgn0000018        583        761        245        310      722      299
#>             treated3
#> FBgn0000003        1
#> FBgn0000008       70
#> FBgn0000014        0
#> FBgn0000015        0
#> FBgn0000017     3334
#> FBgn0000018      308

# get metadata
SampleAnno = read.csv(file.path(datafiles, "pasilla_sample_annotation.csv"))
head(SampleAnno)
#>           file condition        type number.of.lanes total.number.of.reads
#> 1   treated1fb   treated single-read               5              35158667
#> 2   treated2fb   treated  paired-end               2         12242535 (x2)
#> 3   treated3fb   treated  paired-end               2         12443664 (x2)
#> 4 untreated1fb untreated single-read               2              17812866
#> 5 untreated2fb untreated single-read               6              34284521
#> 6 untreated3fb untreated  paired-end               2         10542625 (x2)
#>   exon.counts
#> 1    15679615
#> 2    15620018
#> 3    12733865
#> 4    14924838
#> 5    20764558
#> 6    10283129

# DE analysis with edgeR
if(!require("edgeR", quietly = TRUE)){
  BiocManager::install("edgeR")
}
#> 
#> Attaching package: 'limma'
#> The following object is masked from 'package:DEXSeq':
#> 
#>     plotMA
#> The following object is masked from 'package:DESeq2':
#> 
#>     plotMA
#> The following object is masked from 'package:BiocGenerics':
#> 
#>     plotMA
library(edgeR)

Group <- as.factor(c(rep("untreated", 4), rep("treated", 3)))
y <- DGEList(counts = count.table, group = Group)

# filter genes
keep <- filterByExpr(y)
y <- y[keep, , keep.lib.sizes = FALSE]

# normalize counts
y <- calcNormFactors(y)
norm.counts <- cpm(y, normalized.lib.sizes = TRUE, log = TRUE)

# plot interactive PCA
mds <- plotMDS(y, gene.selection = "common", plot = FALSE)
toplot <- data.frame(Dim1 = mds$x, Dim2 = mds$y, Group = Group, sample = colnames(mds$distance.matrix.squared))
pcaPlot <- ggplot(toplot, aes(Dim1, Dim2, colour = Group)) +
  geom_point(aes(label = sample), size=3) + coord_fixed() +
  xlab(paste0("PC1: ", round(mds$var.explained[1]*100), "% variance")) +
  ylab(paste0("PC2: ", round(mds$var.explained[2]*100), "% variance")) +
  scale_color_biolizard() +
  lizard_style()
#> Warning in geom_point(aes(label = sample), size = 3): Ignoring unknown
#> aesthetics: label
ggplotly(pcaPlot)

# fit generalized linear model (quasi-likelihood negative binomial)
design <- model.matrix(~ 0 + Group)
colnames(design) <- levels(Group)
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)

# fit contrasts of interest
my.contrasts <- makeContrasts(treatedVsUntreated = treated-untreated,
                              levels=design)

# quasi-likelihood F-test: the test statistic is an F statistic, not a likelihood ratio
qlf <- glmQLFTest(fit, contrast = my.contrasts)

DE <- topTags(qlf, n = "all")$table
colnames(DE) <- c("log2FoldChange", "logCPM", "F", "pvalue", "padj")
DE <- DE[order(DE$padj), ]
DE <- DE[!is.na(DE$padj), ]
head(DE)
#>             log2FoldChange   logCPM         F       pvalue         padj
#> FBgn0039155      -4.602523 5.882320 1019.4151 3.758893e-11 2.976668e-07
#> FBgn0025111       2.906554 6.925429  743.3947 1.709017e-10 6.766851e-07
#> FBgn0029167      -2.189727 8.222828  562.8184 6.470046e-10 1.707876e-06
#> FBgn0035085      -2.548921 5.685708  500.7079 1.130499e-09 2.132208e-06
#> FBgn0003360      -3.171233 8.451316  482.6591 1.346261e-09 2.132208e-06
#> FBgn0034736      -3.500888 4.189188  406.1721 3.066611e-09 3.618419e-06
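
The padj column holds p-values adjusted for multiple testing; topTags() uses the Benjamini-Hochberg method by default. A minimal base-R illustration of that adjustment, using made-up p-values:

```r
# Made-up p-values for illustration only
pvals <- c(0.001, 0.01, 0.02, 0.03, 0.5)

# Benjamini-Hochberg adjustment controls the false discovery rate
padj <- p.adjust(pvals, method = "BH")
padj

# Number of tests called significant at a 5% FDR threshold
n_sig <- sum(padj < 0.05)
n_sig
```

Adjusted values are never smaller than the raw p-values, so thresholding padj at 0.05 is stricter than thresholding pvalue at 0.05.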

# interactive volcano plot
p <- DE %>% {
  ggplot(., aes(x = log2FoldChange, y = -log10(pvalue))) +
  geom_point(aes(color = padj < 0.05, label = rownames(.))) +
  scale_color_biolizard(reverse = TRUE) +
  lizard_style()
}
#> Warning in geom_point(aes(color = padj < 0.05, label = rownames(.))): Ignoring
#> unknown aesthetics: label
ggplotly(p)
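
The hover content can be narrowed down with the tooltip argument of ggplotly(). A common pattern is to map the label to a text aesthetic and request only that one. The sketch uses mtcars as stand-in data; ggplot2 still warns that text is an unknown aesthetic, which is expected here.

```r
library(ggplot2)
library(plotly)

cars <- datasets::mtcars
cars$model <- rownames(cars)

# 'text' is not a geom_point() aesthetic (ggplot2 warns about it),
# but ggplotly() can use it for the hover label
p_hover <- ggplot(cars, aes(x = hp, y = mpg)) +
  geom_point(aes(text = model))

# Show only the car model in the tooltip; print the object to display it
interactive_plot <- ggplotly(p_hover, tooltip = "text")
```

Passing a vector such as c("text", "x", "y") instead keeps the coordinates in the tooltip as well.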

4.11 Maps

Note: to install ‘sf’, you might first have to install the system libraries libudunits2 and libgdal:

if (!requireNamespace("sf", quietly = TRUE)) {
  # Install the package if it's not installed
  install.packages("sf")
}
if (!requireNamespace("rnaturalearth", quietly = TRUE)) {
  # Install the package if it's not installed
  install.packages("rnaturalearth")
}
if (!requireNamespace("rnaturalearthdata", quietly = TRUE)) {
  # Install the package if it's not installed
  install.packages("rnaturalearthdata")
}

library("sf")
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
library("rnaturalearth")
library("rnaturalearthdata")
#> 
#> Attaching package: 'rnaturalearthdata'
#> The following object is masked from 'package:rnaturalearth':
#> 
#>     countries110

world <- ne_countries(scale = "medium", returnclass = "sf")

BLZ_offices <- c("Belgium", "Netherlands", "Switzerland", "United States of America")
world$BLZ_office <- ifelse(world$name_en %in% BLZ_offices, "yes", "no")
world$BLZ_office <- factor(world$BLZ_office, levels = c("yes", "no"))

ggplot(data = world, aes(fill = BLZ_office)) +
  geom_sf() +
  scale_fill_biolizard(type = "discrete", scheme = "qualitative") +
  lizard_style()

# coloring the ocean, as in the map on our website
ggplot(data = world, aes(fill = BLZ_office)) +
  geom_sf() +
  scale_fill_manual(values = c(blz_yellow, blz_green)) +
  lizard_style() +
  theme(panel.background = element_rect(fill = blz_blue),
        panel.grid = element_blank())

# europe only
ggplot(data = dplyr::filter(world, continent == "Europe"), aes(fill = BLZ_office)) +
  geom_sf() +
  scale_fill_manual(values = c(blz_yellow, blz_green)) +
  lizard_style() +
  theme(panel.background = element_rect(fill = blz_blue),
        panel.grid = element_blank()) +
  coord_sf(xlim = c(-25,50), ylim = c(35,70), expand = FALSE)